Dataset statistics
| Number of variables | 20 |
|---|---|
| Number of observations | 428 |
| Missing cells | 88 |
| Missing cells (%) | 1.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 67.0 KiB |
| Average record size in memory | 160.3 B |
Variable types
| NUM | 11 |
|---|---|
| BOOL | 8 |
| CAT | 1 |
Warnings
| Warning | Type |
|---|---|
| vehicle_name has a high cardinality: 425 distinct values | High cardinality |
| dealer_cost is highly correlated with retail_price | High correlation |
| retail_price is highly correlated with dealer_cost | High correlation |
| hwy_mpg is highly correlated with city_mpg | High correlation |
| city_mpg is highly correlated with hwy_mpg | High correlation |
| city_mpg has 15 (3.5%) missing values | Missing |
| hwy_mpg has 15 (3.5%) missing values | Missing |
| len has 26 (6.1%) missing values | Missing |
| width has 28 (6.5%) missing values | Missing |
| vehicle_name is uniformly distributed | Uniform |
Reproduction
| Analysis started | 2020-12-04 12:48:40.516332 |
|---|---|
| Analysis finished | 2020-12-04 12:49:03.797543 |
| Duration | 23.28 seconds |
| Software version | pandas-profiling v2.9.0 |
vehicle_name
Categorical
| Distinct | 425 |
|---|---|
| Distinct (%) | 99.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.3 KiB |
| Value | Count | Frequency (%) | |
| Mercedes-Benz C320 4dr | 2 | 0.5% | |
| Infiniti G35 4dr | 2 | 0.5% | |
| Mercedes-Benz C240 4dr | 2 | 0.5% | |
| Ford F-150 Regular Cab XL | 1 | 0.2% | |
| Kia Amanti 4dr | 1 | 0.2% | |
| Honda Pilot LX | 1 | 0.2% | |
| Mazda B4000 SE Cab Plus | 1 | 0.2% | |
| Oldsmobile Silhouette GL | 1 | 0.2% | |
| Toyota Corolla S 4dr | 1 | 0.2% | |
| Saab 9-5 Aero | 1 | 0.2% | |
| Other values (415) | 415 | 97.0% |
Frequencies of value counts
Unique
| Unique | 422 |
|---|---|
| Unique (%) | 98.6% |
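The gap between the "Distinct" and "Unique" counts can be read as follows, assuming (as the figures suggest) that "Distinct" counts different values and "Unique" counts values that occur exactly once: 425 distinct names, three of which appear twice, leave 422 unique names. A minimal Python sketch of that reading, where `distinct_and_unique` is a hypothetical helper name and the list is a small excerpt rather than the full column:

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique


# Small excerpt of vehicle names: two names are duplicated, two occur once.
names = [
    "Mercedes-Benz C320 4dr",
    "Mercedes-Benz C320 4dr",
    "Infiniti G35 4dr",
    "Infiniti G35 4dr",
    "Kia Amanti 4dr",
    "Honda Pilot LX",
]

distinct, unique = distinct_and_unique(names)
print(distinct, unique)
```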
Histogram of lengths of the category
Length
| Max length | 45 |
|---|---|
| Median length | 21 |
| Mean length | 21.94392523 |
| Min length | 8 |
smallsporty_compactlarge_sedan
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | 0.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.3 KiB |
| Value | Count | Frequency (%) | |
| 1 | 244 | 57.0% | |
| 0 | 184 | 43.0% |
sports_car
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | 0.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.3 KiB |
| Value | Count | Frequency (%) | |
| 0 | 379 | 88.6% | |
| 1 | 49 | 11.4% |
suv
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | 0.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.3 KiB |
| Value | Count | Frequency (%) | |
| 0 | 368 | 86.0% | |
| 1 | 60 | 14.0% |
wagon
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | 0.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.3 KiB |
| Value | Count | Frequency (%) | |
| 0 | 398 | 93.0% | |
| 1 | 30 | 7.0% |
minivan
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | 0.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.3 KiB |
| Value | Count | Frequency (%) | |
| 0 | 408 | 95.3% | |
| 1 | 20 | 4.7% |
pickup
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | 0.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.3 KiB |
| Value | Count | Frequency (%) | |
| 0 | 404 | 94.4% | |
| 1 | 24 | 5.6% |
awd
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | 0.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.3 KiB |
| Value | Count | Frequency (%) | |
| 0 | 336 | 78.5% | |
| 1 | 92 | 21.5% |
rwd
Boolean
| Distinct | 2 |
|---|---|
| Distinct (%) | 0.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.3 KiB |
| Value | Count | Frequency (%) | |
| 0 | 318 | 74.3% | |
| 1 | 110 | 25.7% |
retail_price
Real number (ℝ≥0)
| Distinct | 410 |
|---|---|
| Distinct (%) | 95.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 32774.85514 |
|---|---|
| Minimum | 10280 |
| Maximum | 192465 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 3.3 KiB |
Quantile statistics
| Minimum | 10280 |
|---|---|
| 5-th percentile | 13691 |
| Q1 | 20334.25 |
| median | 27635 |
| Q3 | 39205 |
| 95-th percentile | 72864.25 |
| Maximum | 192465 |
| Range | 182185 |
| Interquartile range (IQR) | 18870.75 |
Descriptive statistics
| Standard deviation | 19431.71667 |
|---|---|
| Coefficient of variation (CV) | 0.5928848988 |
| Kurtosis | 13.87920552 |
| Mean | 32774.85514 |
| Median Absolute Deviation (MAD) | 8314 |
| Skewness | 2.798099275 |
| Sum | 14027638 |
| Variance | 377591612.9 |
| Monotonicity | Not monotonic |
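Several of the figures above are derived from one another. The following sketch re-derives them in Python from the values reported in the tables above; it assumes the usual definitions (range = maximum minus minimum, IQR = Q3 minus Q1, CV = standard deviation divided by mean, variance = standard deviation squared), and small differences are expected because the inputs are rounded.

```python
# Figures copied from the summary, quantile and descriptive tables above.
count = 428                      # number of observations, no missing values
minimum, maximum = 10280, 192465
q1, q3 = 20334.25, 39205
mean, std = 32774.85514, 19431.71667

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range (IQR)
cv = std / mean                  # Coefficient of variation (CV)
variance = std ** 2              # Variance
total = mean * count             # Sum

print(value_range, iqr, round(cv, 4), round(total))
```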
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) | |
| 19860 | 2 | 0.5% | |
| 15389 | 2 | 0.5% | |
| 31545 | 2 | 0.5% | |
| 33995 | 2 | 0.5% | |
| 19635 | 2 | 0.5% | |
| 21595 | 2 | 0.5% | |
| 49995 | 2 | 0.5% | |
| 29995 | 2 | 0.5% | |
| 35940 | 2 | 0.5% | |
| 25700 | 2 | 0.5% | |
| Other values (400) | 408 | 95.3% |
| Value | Count | Frequency (%) | |
| 10280 | 1 | 0.2% | |
| 10539 | 1 | 0.2% | |
| 10760 | 1 | 0.2% | |
| 10995 | 1 | 0.2% | |
| 11155 | 1 | 0.2% |
| Value | Count | Frequency (%) | |
| 192465 | 1 | 0.2% | |
| 128420 | 1 | 0.2% | |
| 126670 | 1 | 0.2% | |
| 121770 | 1 | 0.2% | |
| 94820 | 1 | 0.2% |
dealer_cost
Real number (ℝ≥0)
| Distinct | 425 |
|---|---|
| Distinct (%) | 99.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 30014.70093 |
|---|---|
| Minimum | 9875 |
| Maximum | 173560 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 3.3 KiB |
Quantile statistics
| Minimum | 9875 |
|---|---|
| 5-th percentile | 12836.65 |
| Q1 | 18866 |
| median | 25294.5 |
| Q3 | 35710.25 |
| 95-th percentile | 66471.95 |
| Maximum | 173560 |
| Range | 163685 |
| Interquartile range (IQR) | 16844.25 |
Descriptive statistics
| Standard deviation | 17642.11775 |
|---|---|
| Coefficient of variation (CV) | 0.5877825599 |
| Kurtosis | 13.94616377 |
| Mean | 30014.70093 |
| Median Absolute Deviation (MAD) | 7531 |
| Skewness | 2.834740404 |
| Sum | 12846292 |
| Variance | 311244318.7 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) | |
| 19638 | 2 | 0.5% | |
| 14207 | 2 | 0.5% | |
| 68306 | 2 | 0.5% | |
| 37886 | 1 | 0.2% | |
| 24926 | 1 | 0.2% | |
| 21198 | 1 | 0.2% | |
| 23883 | 1 | 0.2% | |
| 24909 | 1 | 0.2% | |
| 13650 | 1 | 0.2% | |
| 24915 | 1 | 0.2% | |
| Other values (415) | 415 | 97.0% |
| Value | Count | Frequency (%) | |
| 9875 | 1 | 0.2% | |
| 10107 | 1 | 0.2% | |
| 10144 | 1 | 0.2% | |
| 10319 | 1 | 0.2% | |
| 10642 | 1 | 0.2% |
| Value | Count | Frequency (%) | |
| 173560 | 1 | 0.2% | |
| 119600 | 1 | 0.2% | |
| 117854 | 1 | 0.2% | |
| 113388 | 1 | 0.2% | |
| 88324 | 1 | 0.2% |
engine_size_(l)
Real number (ℝ≥0)
| Distinct | 43 |
|---|---|
| Distinct (%) | 10.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.196728972 |
|---|---|
| Minimum | 1.3 |
| Maximum | 8.3 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 3.3 KiB |
Quantile statistics
| Minimum | 1.3 |
|---|---|
| 5-th percentile | 1.7 |
| Q1 | 2.375 |
| median | 3 |
| Q3 | 3.9 |
| 95-th percentile | 5.3 |
| Maximum | 8.3 |
| Range | 7 |
| Interquartile range (IQR) | 1.525 |
Descriptive statistics
| Standard deviation | 1.108594718 |
|---|---|
| Coefficient of variation (CV) | 0.3467903373 |
| Kurtosis | 0.5419435378 |
| Mean | 3.196728972 |
| Median Absolute Deviation (MAD) | 0.8 |
| Skewness | 0.7081519825 |
| Sum | 1368.2 |
| Variance | 1.22898225 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=43)
| Value | Count | Frequency (%) | |
| 3 | 42 | 9.8% | |
| 3.5 | 34 | 7.9% | |
| 2 | 30 | 7.0% | |
| 2.5 | 26 | 6.1% | |
| 2.4 | 23 | 5.4% | |
| 1.8 | 23 | 5.4% | |
| 4.6 | 21 | 4.9% | |
| 4.2 | 20 | 4.7% | |
| 3.2 | 18 | 4.2% | |
| 3.8 | 17 | 4.0% | |
| Other values (33) | 174 | 40.7% |
| Value | Count | Frequency (%) | |
| 1.3 | 2 | 0.5% | |
| 1.4 | 1 | 0.2% | |
| 1.5 | 6 | 1.4% | |
| 1.6 | 10 | 2.3% | |
| 1.7 | 4 | 0.9% |
| Value | Count | Frequency (%) | |
| 8.3 | 1 | 0.2% | |
| 6.8 | 1 | 0.2% | |
| 6 | 6 | 1.4% | |
| 5.7 | 3 | 0.7% | |
| 5.6 | 2 | 0.5% |
cyl
Real number (ℝ)
| Distinct | 8 |
|---|---|
| Distinct (%) | 1.9% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5.775700935 |
|---|---|
| Minimum | -1 |
| Maximum | 12 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 3.3 KiB |
Quantile statistics
| Minimum | -1 |
|---|---|
| 5-th percentile | 4 |
| Q1 | 4 |
| median | 6 |
| Q3 | 6 |
| 95-th percentile | 8 |
| Maximum | 12 |
| Range | 13 |
| Interquartile range (IQR) | 2 |
Descriptive statistics
| Standard deviation | 1.622779362 |
|---|---|
| Coefficient of variation (CV) | 0.2809666532 |
| Kurtosis | 1.396548909 |
| Mean | 5.775700935 |
| Median Absolute Deviation (MAD) | 2 |
| Skewness | 0.2342651493 |
| Sum | 2472 |
| Variance | 2.633412856 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=8)
| Value | Count | Frequency (%) | |
| 6 | 190 | 44.4% | |
| 4 | 136 | 31.8% | |
| 8 | 87 | 20.3% | |
| 5 | 7 | 1.6% | |
| 12 | 3 | 0.7% | |
| 10 | 2 | 0.5% | |
| -1 | 2 | 0.5% | |
| 3 | 1 | 0.2% |
| Value | Count | Frequency (%) | |
| -1 | 2 | 0.5% | |
| 3 | 1 | 0.2% | |
| 4 | 136 | 31.8% | |
| 5 | 7 | 1.6% | |
| 6 | 190 | 44.4% |
| Value | Count | Frequency (%) | |
| 12 | 3 | 0.7% | |
| 10 | 2 | 0.5% | |
| 8 | 87 | 20.3% | |
| 6 | 190 | 44.4% | |
| 5 | 7 | 1.6% |
hp
Real number (ℝ≥0)
| Distinct | 110 |
|---|---|
| Distinct (%) | 25.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 215.885514 |
|---|---|
| Minimum | 73 |
| Maximum | 500 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 3.3 KiB |
Quantile statistics
| Minimum | 73 |
|---|---|
| 5-th percentile | 115 |
| Q1 | 165 |
| median | 210 |
| Q3 | 255 |
| 95-th percentile | 338.25 |
| Maximum | 500 |
| Range | 427 |
| Interquartile range (IQR) | 90 |
Descriptive statistics
| Standard deviation | 71.83603158 |
|---|---|
| Coefficient of variation (CV) | 0.3327505873 |
| Kurtosis | 1.552158629 |
| Mean | 215.885514 |
| Median Absolute Deviation (MAD) | 45 |
| Skewness | 0.9303307363 |
| Sum | 92399 |
| Variance | 5160.415434 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) | |
| 200 | 17 | 4.0% | |
| 210 | 14 | 3.3% | |
| 215 | 14 | 3.3% | |
| 225 | 13 | 3.0% | |
| 240 | 13 | 3.0% | |
| 220 | 12 | 2.8% | |
| 140 | 12 | 2.8% | |
| 300 | 11 | 2.6% | |
| 170 | 11 | 2.6% | |
| 130 | 10 | 2.3% | |
| Other values (100) | 301 | 70.3% |
| Value | Count | Frequency (%) | |
| 73 | 1 | 0.2% | |
| 93 | 1 | 0.2% | |
| 100 | 1 | 0.2% | |
| 103 | 5 | 1.2% | |
| 104 | 3 | 0.7% |
| Value | Count | Frequency (%) | |
| 500 | 1 | 0.2% | |
| 493 | 3 | 0.7% | |
| 477 | 1 | 0.2% | |
| 450 | 1 | 0.2% | |
| 420 | 1 | 0.2% |
city_mpg
Real number (ℝ≥0)
| Distinct | 28 |
|---|---|
| Distinct (%) | 6.8% |
| Missing | 15 |
| Missing (%) | 3.5% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 20.08958838 |
|---|---|
| Minimum | 10 |
| Maximum | 60 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 3.3 KiB |
Quantile statistics
| Minimum | 10 |
|---|---|
| 5-th percentile | 14 |
| Q1 | 17 |
| median | 19 |
| Q3 | 21 |
| 95-th percentile | 29 |
| Maximum | 60 |
| Range | 50 |
| Interquartile range (IQR) | 4 |
Descriptive statistics
| Standard deviation | 5.219382573 |
|---|---|
| Coefficient of variation (CV) | 0.2598053516 |
| Kurtosis | 16.61615357 |
| Mean | 20.08958838 |
| Median Absolute Deviation (MAD) | 2 |
| Skewness | 2.928557209 |
| Sum | 8297 |
| Variance | 27.24195444 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=28)
| Value | Count | Frequency (%) | |
| 18 | 68 | 15.9% | |
| 20 | 56 | 13.1% | |
| 17 | 40 | 9.3% | |
| 21 | 38 | 8.9% | |
| 19 | 38 | 8.9% | |
| 16 | 29 | 6.8% | |
| 24 | 22 | 5.1% | |
| 26 | 21 | 4.9% | |
| 22 | 18 | 4.2% | |
| 15 | 17 | 4.0% | |
| Other values (18) | 66 | 15.4% | |
| (Missing) | 15 | 3.5% |
| Value | Count | Frequency (%) | |
| 10 | 1 | 0.2% | |
| 12 | 2 | 0.5% | |
| 13 | 11 | 2.6% | |
| 14 | 13 | 3.0% | |
| 15 | 17 | 4.0% |
| Value | Count | Frequency (%) | |
| 60 | 1 | 0.2% | |
| 59 | 1 | 0.2% | |
| 46 | 1 | 0.2% | |
| 38 | 1 | 0.2% | |
| 36 | 1 | 0.2% |
hwy_mpg
Real number (ℝ≥0)
| Distinct | 32 |
|---|---|
| Distinct (%) | 7.7% |
| Missing | 15 |
| Missing (%) | 3.5% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 26.90556901 |
|---|---|
| Minimum | 12 |
| Maximum | 66 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 3.3 KiB |
Quantile statistics
| Minimum | 12 |
|---|---|
| 5-th percentile | 18 |
| Q1 | 24 |
| median | 26 |
| Q3 | 29 |
| 95-th percentile | 36 |
| Maximum | 66 |
| Range | 54 |
| Interquartile range (IQR) | 5 |
Descriptive statistics
| Standard deviation | 5.70371136 |
|---|---|
| Coefficient of variation (CV) | 0.2119899921 |
| Kurtosis | 6.425357238 |
| Mean | 26.90556901 |
| Median Absolute Deviation (MAD) | 3 |
| Skewness | 1.350295982 |
| Sum | 11112 |
| Variance | 32.53232328 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=32)
| Value | Count | Frequency (%) | |
| 26 | 54 | 12.6% | |
| 25 | 43 | 10.0% | |
| 28 | 38 | 8.9% | |
| 29 | 34 | 7.9% | |
| 27 | 27 | 6.3% | |
| 24 | 26 | 6.1% | |
| 30 | 23 | 5.4% | |
| 23 | 16 | 3.7% | |
| 21 | 16 | 3.7% | |
| 19 | 15 | 3.5% | |
| Other values (22) | 121 | 28.3% | |
| (Missing) | 15 | 3.5% |
| Value | Count | Frequency (%) | |
| 12 | 1 | 0.2% | |
| 14 | 1 | 0.2% | |
| 16 | 2 | 0.5% | |
| 17 | 9 | 2.1% | |
| 18 | 9 | 2.1% |
| Value | Count | Frequency (%) | |
| 66 | 1 | 0.2% | |
| 51 | 2 | 0.5% | |
| 46 | 1 | 0.2% | |
| 44 | 1 | 0.2% | |
| 43 | 2 | 0.5% |
weight
Real number (ℝ≥0)
| Distinct | 347 |
|---|---|
| Distinct (%) | 81.5% |
| Missing | 2 |
| Missing (%) | 0.5% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3577.213615 |
|---|---|
| Minimum | 1850 |
| Maximum | 7190 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 3.3 KiB |
Quantile statistics
| Minimum | 1850 |
|---|---|
| 5-th percentile | 2513 |
| Q1 | 3102 |
| median | 3474.5 |
| Q3 | 3974.25 |
| 95-th percentile | 4996.75 |
| Maximum | 7190 |
| Range | 5340 |
| Interquartile range (IQR) | 872.25 |
Descriptive statistics
| Standard deviation | 760.4376628 |
|---|---|
| Coefficient of variation (CV) | 0.2125782088 |
| Kurtosis | 1.678289561 |
| Mean | 3577.213615 |
| Median Absolute Deviation (MAD) | 428 |
| Skewness | 0.8933847105 |
| Sum | 1523893 |
| Variance | 578265.439 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) | |
| 3285 | 4 | 0.9% | |
| 3175 | 4 | 0.9% | |
| 3470 | 3 | 0.7% | |
| 4024 | 3 | 0.7% | |
| 3217 | 3 | 0.7% | |
| 2676 | 3 | 0.7% | |
| 4052 | 3 | 0.7% | |
| 2692 | 3 | 0.7% | |
| 3430 | 3 | 0.7% | |
| 3428 | 3 | 0.7% | |
| Other values (337) | 394 | 92.1% |
| Value | Count | Frequency (%) | |
| 1850 | 1 | 0.2% | |
| 2035 | 1 | 0.2% | |
| 2055 | 1 | 0.2% | |
| 2085 | 1 | 0.2% | |
| 2195 | 1 | 0.2% |
| Value | Count | Frequency (%) | |
| 7190 | 1 | 0.2% | |
| 6400 | 1 | 0.2% | |
| 6133 | 1 | 0.2% | |
| 5969 | 1 | 0.2% | |
| 5879 | 1 | 0.2% |
wheel_base
Real number (ℝ≥0)
| Distinct | 40 |
|---|---|
| Distinct (%) | 9.4% |
| Missing | 2 |
| Missing (%) | 0.5% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 108.1737089 |
|---|---|
| Minimum | 89 |
| Maximum | 144 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 3.3 KiB |
Quantile statistics
| Minimum | 89 |
|---|---|
| 5-th percentile | 95.25 |
| Q1 | 103 |
| median | 107 |
| Q3 | 112 |
| 95-th percentile | 123 |
| Maximum | 144 |
| Range | 55 |
| Interquartile range (IQR) | 9 |
Descriptive statistics
| Standard deviation | 8.326449076 |
|---|---|
| Coefficient of variation (CV) | 0.07697294619 |
| Kurtosis | 2.112464038 |
| Mean | 108.1737089 |
| Median Absolute Deviation (MAD) | 5 |
| Skewness | 0.9552742051 |
| Sum | 46082 |
| Variance | 69.32975421 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=40)
| Value | Count | Frequency (%) | |
| 107 | 45 | 10.5% | |
| 103 | 30 | 7.0% | |
| 106 | 27 | 6.3% | |
| 112 | 25 | 5.8% | |
| 104 | 22 | 5.1% | |
| 105 | 21 | 4.9% | |
| 115 | 20 | 4.7% | |
| 111 | 17 | 4.0% | |
| 109 | 17 | 4.0% | |
| 101 | 16 | 3.7% | |
| Other values (30) | 186 | 43.5% |
| Value | Count | Frequency (%) | |
| 89 | 2 | 0.5% | |
| 93 | 9 | 2.1% | |
| 95 | 11 | 2.6% | |
| 96 | 5 | 1.2% | |
| 97 | 3 | 0.7% |
| Value | Count | Frequency (%) | |
| 144 | 2 | 0.5% | |
| 140 | 1 | 0.2% | |
| 137 | 1 | 0.2% | |
| 133 | 2 | 0.5% | |
| 131 | 1 | 0.2% |
len
Real number (ℝ≥0)
| Distinct | 61 |
|---|---|
| Distinct (%) | 15.2% |
| Missing | 26 |
| Missing (%) | 6.1% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 185.1268657 |
|---|---|
| Minimum | 143 |
| Maximum | 227 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 3.3 KiB |
Quantile statistics
| Minimum | 143 |
|---|---|
| 5-th percentile | 162.05 |
| Q1 | 177 |
| median | 186 |
| Q3 | 193 |
| 95-th percentile | 207 |
| Maximum | 227 |
| Range | 84 |
| Interquartile range (IQR) | 16 |
Descriptive statistics
| Standard deviation | 13.31252292 |
|---|---|
| Coefficient of variation (CV) | 0.07191027013 |
| Kurtosis | 0.3112242615 |
| Mean | 185.1268657 |
| Median Absolute Deviation (MAD) | 8 |
| Skewness | -0.09622117289 |
| Sum | 74421 |
| Variance | 177.2232665 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
| Value | Count | Frequency (%) | |
| 178 | 26 | 6.1% | |
| 190 | 22 | 5.1% | |
| 187 | 17 | 4.0% | |
| 192 | 14 | 3.3% | |
| 200 | 13 | 3.0% | |
| 177 | 13 | 3.0% | |
| 188 | 13 | 3.0% | |
| 179 | 13 | 3.0% | |
| 175 | 12 | 2.8% | |
| 183 | 12 | 2.8% | |
| Other values (51) | 247 | 57.7% | |
| (Missing) | 26 | 6.1% |
| Value | Count | Frequency (%) | |
| 143 | 1 | 0.2% | |
| 144 | 1 | 0.2% | |
| 150 | 1 | 0.2% | |
| 153 | 2 | 0.5% | |
| 154 | 1 | 0.2% |
| Value | Count | Frequency (%) | |
| 227 | 1 | 0.2% | |
| 221 | 1 | 0.2% | |
| 219 | 2 | 0.5% | |
| 215 | 2 | 0.5% | |
| 212 | 7 | 1.6% |
width
Real number (ℝ≥0)
| Distinct | 18 |
|---|---|
| Distinct (%) | 4.5% |
| Missing | 28 |
| Missing (%) | 6.5% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 71.2925 |
|---|---|
| Minimum | 64 |
| Maximum | 81 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 3.3 KiB |
Quantile statistics
| Minimum | 64 |
|---|---|
| 5-th percentile | 67 |
| Q1 | 69 |
| median | 71 |
| Q3 | 73 |
| 95-th percentile | 78 |
| Maximum | 81 |
| Range | 17 |
| Interquartile range (IQR) | 4 |
Descriptive statistics
| Standard deviation | 3.393483915 |
|---|---|
| Coefficient of variation (CV) | 0.04759945177 |
| Kurtosis | -0.2123582009 |
| Mean | 71.2925 |
| Median Absolute Deviation (MAD) | 2 |
| Skewness | 0.5607116725 |
| Sum | 28517 |
| Variance | 11.51573308 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=18)
| Value | Count | Frequency (%) | |
| 72 | 55 | 12.9% | |
| 68 | 47 | 11.0% | |
| 69 | 42 | 9.8% | |
| 73 | 42 | 9.8% | |
| 70 | 41 | 9.6% | |
| 71 | 36 | 8.4% | |
| 67 | 36 | 8.4% | |
| 74 | 21 | 4.9% | |
| 75 | 17 | 4.0% | |
| 78 | 16 | 3.7% | |
| Other values (8) | 47 | 11.0% | |
| (Missing) | 28 | 6.5% |
| Value | Count | Frequency (%) | |
| 64 | 1 | 0.2% | |
| 65 | 3 | 0.7% | |
| 66 | 10 | 2.3% | |
| 67 | 36 | 8.4% | |
| 68 | 47 | 11.0% |
| Value | Count | Frequency (%) | |
| 81 | 1 | 0.2% | |
| 80 | 2 | 0.5% | |
| 79 | 12 | 2.8% | |
| 78 | 16 | 3.7% | |
| 77 | 6 | 1.4% |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, implying that for a linear relationship the steepness of the line does not affect r, only its direction. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
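As an illustration of that calculation, here is a minimal pure-Python sketch. The function name `pearson_r` is chosen for this example and is not taken from the report's tooling; the sample values are the retail_price and dealer_cost figures from the first five rows of the dataset. The 1/n factors in the covariance and the standard deviations cancel, so they are omitted.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of (co)deviations; the common 1/n normalisation cancels in the ratio.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# retail_price and dealer_cost for the first five rows of the dataset.
retail_price = [43755, 46100, 36945, 89765, 23820]
dealer_cost = [39014, 41100, 33337, 79978, 21761]

r = pearson_r(retail_price, dealer_cost)

# Shifting and rescaling one variable (positive factor) leaves r unchanged.
r_rescaled = pearson_r([x / 1000 + 5 for x in retail_price], dealer_cost)
print(r, r_rescaled)
```

On these five rows r comes out very close to 1, in line with the high-correlation warning for the two price columns, although five rows are of course not the full dataset.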
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
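A minimal pure-Python sketch of that recipe: rank both variables, then apply the Pearson formula to the ranks. The helper names are chosen for this example, ties are given the average of their rank positions (a common convention, assumed here), and the data are a toy monotonic but nonlinear relationship.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def spearman_rho(xs, ys):
    """Pearson correlation of the rank variables."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


# Toy data: y = x**3 is monotonic in x but not linear.
xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]
print(spearman_rho(xs, ys), pearson_r(xs, ys))
```

Here ρ reaches its maximum because the ordering of the two variables agrees perfectly, while r stays below 1 because the relationship is not a straight line.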
Kendall's τ
Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is then the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
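The pair-counting description can be sketched directly in Python. The function name `kendall_tau_a` is chosen for this example; it implements the formula exactly as worded above (often called tau-a), in which tied pairs count as neither concordant nor discordant. Statistical libraries commonly report a tie-corrected variant instead, so their values may differ when the data contain ties.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1   # both variables move in the same direction
        elif sign < 0:
            discordant += 1   # the variables move in opposite directions
        # sign == 0 means a tie in x or y: counted as neither
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# Three observations: two concordant pairs and one discordant pair.
tau = kendall_tau_a([1, 2, 3], [1, 3, 2])
print(tau)
```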
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.
First rows
| | vehicle_name | smallsporty_compactlarge_sedan | sports_car | suv | wagon | minivan | pickup | awd | rwd | retail_price | dealer_cost | engine_size_(l) | cyl | hp | city_mpg | hwy_mpg | weight | wheel_base | len | width |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Acura 3.5 RL 4dr | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 43755 | 39014 | 3.5 | 6 | 225 | 18.0 | 24.0 | 3880.0 | 115.0 | 197.0 | 72.0 |
| 1 | Acura 3.5 RL w/Navigation 4dr | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 46100 | 41100 | 3.5 | 6 | 225 | 18.0 | 24.0 | 3893.0 | 115.0 | 197.0 | 72.0 |
| 2 | Acura MDX | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 36945 | 33337 | 3.5 | 6 | 265 | 17.0 | 23.0 | 4451.0 | 106.0 | 189.0 | 77.0 |
| 3 | Acura NSX coupe 2dr manual S | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 89765 | 79978 | 3.2 | 6 | 290 | 17.0 | 24.0 | 3153.0 | 100.0 | 174.0 | 71.0 |
| 4 | Acura RSX Type S 2dr | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 23820 | 21761 | 2.0 | 4 | 200 | 24.0 | 31.0 | 2778.0 | 101.0 | 172.0 | 68.0 |
| 5 | Acura TL 4dr | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 33195 | 30299 | 3.2 | 6 | 270 | 20.0 | 28.0 | 3575.0 | 108.0 | 186.0 | 72.0 |
| 6 | Acura TSX 4dr | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 26990 | 24647 | 2.4 | 4 | 200 | 22.0 | 29.0 | 3230.0 | 105.0 | 183.0 | 69.0 |
| 7 | Audi A4 1.8T 4dr | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 25940 | 23508 | 1.8 | 4 | 170 | 22.0 | 31.0 | 3252.0 | 104.0 | 179.0 | 70.0 |
| 8 | Audi A4 3.0 4dr | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 31840 | 28846 | 3.0 | 6 | 220 | 20.0 | 28.0 | 3462.0 | 104.0 | 179.0 | 70.0 |
| 9 | Audi A4 3.0 convertible 2dr | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 42490 | 38325 | 3.0 | 6 | 220 | 20.0 | 27.0 | 3814.0 | 105.0 | 180.0 | 70.0 |
Last rows
| | vehicle_name | smallsporty_compactlarge_sedan | sports_car | suv | wagon | minivan | pickup | awd | rwd | retail_price | dealer_cost | engine_size_(l) | cyl | hp | city_mpg | hwy_mpg | weight | wheel_base | len | width |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 418 | Volvo S40 4dr | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 25135 | 23701 | 1.9 | 4 | 170 | 22.0 | 29.0 | 2767.0 | 101.0 | 178.0 | 68.0 |
| 419 | Volvo S60 2.5 4dr | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 31745 | 29916 | 2.5 | 5 | 208 | 20.0 | 27.0 | 3903.0 | 107.0 | 180.0 | 71.0 |
| 420 | Volvo S60 R 4dr | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 37560 | 35382 | 2.5 | 5 | 300 | 18.0 | 25.0 | 3571.0 | 107.0 | 181.0 | 71.0 |
| 421 | Volvo S60 T5 4dr | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 34845 | 32902 | 2.3 | 5 | 247 | 20.0 | 28.0 | 3766.0 | 107.0 | 180.0 | 71.0 |
| 422 | Volvo S80 2.5T 4dr | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 37885 | 35688 | 2.5 | 5 | 194 | 20.0 | 27.0 | 3691.0 | 110.0 | 190.0 | 72.0 |
| 423 | Volvo S80 2.9 4dr | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 37730 | 35542 | 2.9 | 6 | 208 | 20.0 | 28.0 | 3576.0 | 110.0 | 190.0 | 72.0 |
| 424 | Volvo S80 T6 4dr | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 45210 | 42573 | 2.9 | 6 | 268 | 19.0 | 26.0 | 3653.0 | 110.0 | 190.0 | 72.0 |
| 425 | Volvo V40 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 26135 | 24641 | 1.9 | 4 | 170 | 22.0 | 29.0 | 2822.0 | 101.0 | 180.0 | 68.0 |
| 426 | Volvo XC70 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 35145 | 33112 | 2.5 | 5 | 208 | NaN | NaN | 3823.0 | 109.0 | 186.0 | 73.0 |
| 427 | Volvo XC90 T6 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 0 | 41250 | 38851 | 2.9 | 6 | 268 | 15.0 | 20.0 | 4638.0 | 113.0 | 189.0 | 75.0 |